Conversation

RustyYato

@RustyYato RustyYato commented Aug 5, 2025

Add repr(ordered_fields) and provide a migration path to switch users from repr(C) to repr(ordered_fields), then change the meaning of repr(C) in the next edition.

This RFC is meant to be an MVP, and any extensions (for example, adding more reprs) are not in scope. This is done to make it as easy as possible to accept this RFC and make progress on the issue of repr(C) serving two opposing roles.

Rendered

To avoid endless bikeshedding, I'll make a poll if this RFC is accepted with all the potential names for the new repr. If you have a new name, I'll add it to the list of names in the unresolved questions section, and will include it in the poll.

@clarfonthey

clarfonthey commented Aug 5, 2025

Not to add too many extra colours to the list, but repr(consistent) feels like a good name for this, since the purpose is to provide a consistent layout that does not depend on generics, compiler version, or target. The important thing is just that it's consistent, not that it matches what C does.

(Note: those three things should cover every case I've seen that uses repr(C) that should use repr(ordered_fields), but please feel free to correct me if I missed anything.)

Whereas repr(C) is explicitly, match what C does.

Also, while it may be more technical than most users need to understand, it would be helpful if the RFC reiterated the current issues with repr(C) that we want to fix, and potential future differences between repr(C) and repr(ordered_fields) that could pop up. I've read some of them but am not 100% sure of the details, and it would be nice to keep as part of the RFC.

@Lokathor
Contributor

Lokathor commented Aug 5, 2025

Just as a small point of style, the Guide Level Explanation is usually "what would be written in the rust tutorial book", and the Reference Level Explanation is "what could be written into the Rust Reference". This isn't a strict requirement, but personally I'd like to see the Reference Level part written out, using the present tense, as if the RFC were accepted and implemented.

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

`repr(ordered_fields)` is a new representation that can be applied to `struct`, `enum`, and `union` to give them a consistent, cross-platform, and predictable in memory layout.
Member

Suggested change
-`repr(ordered_fields)` is a new representation that can be applied to `struct`, `enum`, and `union` to give them a consistent, cross-platform, and predictable in memory layout.
+`repr(ordered_fields)` is a new representation that can be applied to `struct`, `enum`, and `union` to give them a consistent, cross-platform, and predictable in-memory layout.

"cross-platform" -- the layout will differ when there are different layouts for struct members' types, in particular primitive types can have different alignments which changes the amount of padding.

e.g., #[repr(ordered_fields)] struct S(u8, f64); doesn't have the same layout on x86_64 and i686
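
A rough way to check this on a given target, assuming today's `repr(C)` as a stand-in (since `repr(ordered_fields)` does not exist yet) and named fields `a` and `b` in place of the tuple fields:

```rust
use std::mem::{align_of, offset_of, size_of};

// `repr(C)` stands in for the proposed `repr(ordered_fields)`.
#[repr(C)]
#[allow(dead_code)]
struct S {
    a: u8,
    b: f64,
}

/// Returns (offset of `b`, size of `S`, alignment of `S`) for the current target.
fn layout_of_s() -> (usize, usize, usize) {
    (offset_of!(S, b), size_of::<S>(), align_of::<S>())
}

fn main() {
    let (offset_b, size, align) = layout_of_s();
    // The field order is fixed, but the padding before `b` follows the
    // target's alignment for `f64`, so these numbers can differ per target.
    println!("offset of b = {offset_b}, size = {size}, align = {align}");
}
```

Building this for two targets with different `f64` alignment should print different offsets and sizes even though the declaration is identical.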

Author

Good point, this will need to be documented as a hazard in the ordered_fields docs. However, the repr itself will be cross-platform. For example, #[repr(ordered_fields)] struct Cross([u8; 3], SomeEnum); will be truly cross-platform (given that SomeEnum is!).
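
A minimal sketch of that `Cross` case, again using `repr(C)` as a stand-in and assuming a hypothetical `SomeEnum` with an explicit `repr(u8)` tag (the field names are also illustrative):

```rust
use std::mem::{align_of, offset_of, size_of};

// Hypothetical enum with an explicit one-byte tag, so its layout is target-independent.
#[repr(u8)]
#[allow(dead_code)]
enum SomeEnum {
    A,
    B,
}

// `repr(C)` stands in for the proposed `repr(ordered_fields)`.
#[repr(C)]
#[allow(dead_code)]
struct Cross {
    bytes: [u8; 3],
    tag: SomeEnum,
}

/// Returns (offset of `tag`, size of `Cross`, alignment of `Cross`).
fn cross_layout() -> (usize, usize, usize) {
    (offset_of!(Cross, tag), size_of::<Cross>(), align_of::<Cross>())
}

fn main() {
    // Every field has size and alignment fixed by the language, not the target,
    // so no target-dependent padding can appear.
    println!("{:?}", cross_layout());
}
```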

RustyYato and others added 3 commits August 5, 2025 16:18
@ehuss ehuss added the T-lang Relevant to the language team, which will review and decide on the RFC. label Aug 6, 2025
@moonheart08

> Not to add too many extra colours to the list, but repr(consistent) feels like a good name for this, since the purpose is to provide a consistent layout that does not depend on generics, compiler version, or target. The important thing is just that it's consistent, not that it matches what C does.
>
> (Note: those three things should cover every case I've seen that uses repr(C) that should use repr(ordered_fields), but please feel free to correct me if I missed anything.)
>
> Whereas repr(C) is explicitly, match what C does.
>
> Also, while it may be more technical than most users need to understand, it would be helpful if the RFC reiterated the current issues with repr(C) that we want to fix, and potential future differences between repr(C) and repr(ordered_fields) that could pop up. I've read some of them but am not 100% sure of the details, and it would be nice to keep as part of the RFC.

Just voicing support for repr(consistent) as naming.
Aside from the above, it more clearly homes in on the primary promises of the RFC, which is not just ordering but also exact type representation for things like enums. Field ordering is not the only thing it promises.

@joshtriplett joshtriplett added the I-lang-nominated Indicates that an issue has been nominated for prioritizing at the next lang team meeting. label Aug 6, 2025
@joshtriplett
Member

Nominating this so that we can do a preliminary vibe-check on it in a lang triage meeting.

Comment on lines +14 to +16
Currently `repr(C)` serves two roles
1. Provide a consistent, cross-platform, predictable layout for a given type
2. Match the target C compiler's struct/union layout algorithm and ABI
Member

Big fan of doing this split, especially for structs. (It's less obvious what choices to make for other things, IMHO, but at least for structs this is something I've wanted for ages, so that for example Layout::extend can talk about it instead of C.)

Pondering the bikeshed: declaration_order or something could also be used to directly say what you're getting.

(This could be contrasted with other potential reprs that I wouldn't expect this RFC to add, but could consider as future work, like a deterministic_by_size_and_alignment where some restricted set of optimizations are allowed but you can be sure that usize and NonNull<String> can be mixed between different types while still getting the "same" field offsets, for example.)
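
For context on the `Layout::extend` mention, here is a sketch of the declaration-order algorithm expressed with `std::alloc::Layout`, compared against a `repr(C)` struct. The struct `Example` and the helper name `declaration_order_layout` are illustrative only:

```rust
use std::alloc::{Layout, LayoutError};
use std::mem::offset_of;

#[repr(C)]
#[allow(dead_code)]
struct Example {
    a: u8,
    b: u32,
    c: u16,
}

/// Places each field after the previous one, padding as needed, and returns
/// the overall layout together with the offset of every field.
fn declaration_order_layout(fields: &[Layout]) -> Result<(Layout, Vec<usize>), LayoutError> {
    let mut offsets = Vec::with_capacity(fields.len());
    let mut layout = Layout::from_size_align(0, 1)?;
    for &field in fields {
        let (extended, offset) = layout.extend(field)?;
        layout = extended;
        offsets.push(offset);
    }
    // `extend` leaves out trailing padding, so add it at the end.
    Ok((layout.pad_to_align(), offsets))
}

fn main() -> Result<(), LayoutError> {
    let fields = [Layout::new::<u8>(), Layout::new::<u32>(), Layout::new::<u16>()];
    let (layout, offsets) = declaration_order_layout(&fields)?;
    println!("size = {}, align = {}, offsets = {:?}", layout.size(), layout.align(), offsets);
    println!("matches repr(C): {}", layout == Layout::new::<Example>());
    Ok(())
}
```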

Author

I think this is also useful for unions, so we don't need to rely on repr(C) to ensure that all fields of a union are at offset 0.

> This could be contrasted with other potential reprs that I wouldn't expect this RFC to add...

This also works as an argument against names like repr(consistent), since there are multiple consistent and useful repr, making it not descriptive enough.
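
To illustrate the union guarantee mentioned above, a small sketch that measures field offsets in a `repr(C)` union (standing in for `repr(ordered_fields)`; the union `Bits` and its fields are made up for the example):

```rust
use std::ptr::addr_of;

#[repr(C)]
#[allow(dead_code)]
union Bits {
    int: u32,
    float: f32,
    bytes: [u8; 4],
}

/// Byte offsets of each field relative to the start of the union.
#[allow(unused_unsafe)]
fn field_offsets() -> [usize; 3] {
    let value = Bits { int: 0 };
    let base = addr_of!(value) as usize;
    // Only addresses are computed here; no union field is read.
    unsafe {
        [
            addr_of!(value.int) as usize - base,
            addr_of!(value.float) as usize - base,
            addr_of!(value.bytes) as usize - base,
        ]
    }
}

fn main() {
    println!("{:?}", field_offsets());
}
```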

Member

I do think that declaration_order or ordered_fields is a bit weird on a union, because of course they're not really in any "order".

It makes me ponder whether we should just have repr(offset_zero) for unions to be explicit about it, or something.

(Which makes me think of other things like addressing rust-lang/unsafe-code-guidelines#494 by having different constructs for "bag of maybeuninit stuff that overlap" vs "distinct options with an active-variant rule for enum-but-without-stored-discriminant". But those are definitely not this RFC.)

Author

I don't mind spelling it as repr(offset_zero) for unions if that helps get this RFC accepted 😄. However, I have a sneaking suspicion that this isn't the contentious part of this RFC.
I know the name isn't optimal (intentionally). This can be hashed out after the RFC is accepted (or even give a different name for all of struct, union, and enum).
The most important bit for me is just that we do the split (for all of struct, union, and enum, to be consistent).

@RustyYato
Author

I've updated how enums' tags are specified: now they just defer to whatever repr(C)'s tag type is. This is done to reduce the friction of switching from repr(C) to repr(ordered_fields). To ensure that all uses of repr(ordered_fields) can be cross-platform, I've added a lint to ensure that the user also adds an explicit repr for repr(uN)/repr(iN).
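
A sketch of the difference an explicit tag repr makes, written against today's `repr(C)` since `repr(ordered_fields)` is not implemented; the enum names are illustrative:

```rust
use std::mem::size_of;

// The tag type is whatever the target's C ABI picks for this enum.
#[repr(C)]
#[allow(dead_code)]
enum TargetTag {
    A,
    B,
}

// Explicit one-byte tag, the same on every target.
#[repr(u8)]
#[allow(dead_code)]
enum ByteTag {
    A,
    B,
}

// Explicit one-byte tag plus a payload: a tag followed by a union of the
// variants' fields.
#[repr(C, u8)]
#[allow(dead_code)]
enum Tagged {
    Empty,
    Byte(u8),
}

/// Sizes of the two enums whose tag type is spelled out.
fn explicit_tag_sizes() -> (usize, usize) {
    (size_of::<ByteTag>(), size_of::<Tagged>())
}

/// Size of the enum whose tag type is left to the target.
fn target_tag_size() -> usize {
    size_of::<TargetTag>()
}

fn main() {
    println!("explicit tags: {:?}", explicit_tag_sizes());
    // Target-dependent, which is what the lint is meant to flag.
    println!("target-chosen tag: {}", target_tag_size());
}
```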


`repr(C)` in edition <= 2024 is an alias for `repr(ordered_fields)` and in all other editions, it matches the default C compiler for the given target for structs, unions, and field-less enums. Enums with fields will be laid out as if they are a union of structs with the corresponding fields.

Using `repr(C)` in editions <= 2024 triggers a lint to use `repr(ordered_fields)` as a future compatibility lint with a machine-applicable fix. If you are using `repr(C)` for FFI, then you may silence this lint. If you are using `repr(C)` for anything else, please switch over to `repr(ordered_fields)` so updating to future editions doesn't change the meaning of your code.
Member

I think this is too noisy. Most code out there using repr(C) is probably fine - IIUC, if you're not targeting Windows or AIX, maybe definitely fine? - and having a bunch of allow(...) across a bunch of projects seems unfortunate.

Maybe we can either (a) only enable the lint for migration, i.e., the next edition's cargo fix would add allows for you or (b) we find some new name... C2 for the existing repr(C) usage to avoid allows. But (b) also seems too noisy to me.

Maybe it could just be an optional edition compatibility lint, so if someone enables e.g. rust_20xx_compatibility it shows up but otherwise not.

Author

> (a) only enable the lint for migration

That was the intention, hence the name edition_2024_repr_c. I'll make this more clear, that this is intended to be a migration lint.

Rustfix would update to #[repr(ordered_fields)] to preserve the current behavior. For the FFI crates, #![allow(edition_2024_repr_c)] at the top of lib.rs would suffice. If you have a mix of FFI and non-FFI uses of repr(C), then you'll have to do the work to figure out which is which, no matter what option is chosen to update repr(C) - even adding repr(C2), since then the FFI use case would need to update all their reprs to repr(C2).

Overall, I think this scheme only significantly burdens those who have a mix of FFI and non-FFI uses of repr(C). But they were going to be burdened no matter what option was chosen.

Author

Is the new wording/lints too noisy still?

Member

I think the new wording is still too noisy. We shouldn't assume that most people using repr(C) are using it for ordering rather than FFI.

Author

That wasn't my intention, but I don't see another way to do all of the following in the next edition:

  • make repr(C) mean - same layout/ABI as what the standard C compiler does
  • make repr(ordered_fields) - the same algorithm that's listed for repr(C) in the Rust reference
  • ensure that everyone who upgrades to the next edition gets the layout they need (as long as they read the warnings and follow the given advice)
  • make it as painless as possible for people who don't mix FFI and stable ordering cases (which I suspect is the vast majority of people). In other words, each crate currently uses repr(C) either exclusively for FFI or exclusively for some stable layout.
  • for people who do mix FFI and stable ordering cases in one crate, at least the warning should give them all the places they need to double-check, and they can silence the warning on a case-by-case basis.

I'm open to suggestions on how to handle the diagnostics. Within these constraints, I think my solution is the only real option we have. If there are some objections to these constraints, I would like to hear those too, maybe I missed the mark with these constraints, and missed a potential solution because of it.

Comment on lines 114 to 120
## `repr(ordered_fields)`

> The `ordered_fields` representation is designed for one purpose: create types that you can soundly perform operations on that rely on data layout such as reinterpreting values as a different type
>
> This representation can be applied to structs, unions, and enums.
>
> - edited version of the [reference](https://doc.rust-lang.org/stable/reference/type-layout.html#the-c-representation) on `repr(C)`
Contributor

I think it would be nice to guarantee that if a repr(ordered_fields) ADT is layout-compatible (same size, alignment, and field offsets, discounting repr(Rust) ZST fields) with a struct or union defined in C, then it is also ABI-compatible with that C struct/union, as far as possible. This would allow using repr(ordered_fields) to interoperate with C code that defines types with weird pragmas, like AIX #pragma align (natural).

Author

This is possible, and I've listed something similar as an unresolved question (should repr(ordered_fields) have a well-defined ABI?).

However, I think it would be best to punt this to a future RFC/ACP/discussion/etc. It doesn't seem to be required to split repr(C). Matching a weird pragma on a Tier 3 platform is not enough motivation to add to this RFC. (As I would like to keep this as lean as possible, only keeping the necessary portions)

Member

Note that we currently do have an internal linear repr which is used for Box and that does not inhibit ABI optimizations, that's necessary for box to act as a pointer type.

Comment on lines 51 to 53
This also plays a role in [#3718](https://github.com/rust-lang/rfcs/pull/3718), where `repr(C, packed(N))` wants to allow fields which are `align(M)` (while making the `repr(C, ...)` struct less packed). This is a footgun for normal uses of `repr(packed)`, so it would be better to relegate this strictly to the FFI use-case. However, since `repr(C)` plays two roles, this is difficult.

By splitting `repr(ordered_fields)` off of `repr(C)`, we can allow `repr(C, packed(N))` to contain over-aligned fields (while making the struct less packed), and (continuing to) disallow `repr(ordered_fields, packed(N))` from containing aligned fields. Thus keeping the Rust-only case free of warts, without compromising on FFI use-cases.
Contributor

@Jules-Bertholet Jules-Bertholet Sep 3, 2025

Is this something you intend to do as part of this RFC, or is it a future possibility? If the former, you should specify the exact behavior in the RFC. If the latter, it should be mentioned in the future possibilities section.

Also, what is the motivation for “(continuing to) disallow repr(ordered_fields, packed(N)) from containing aligned fields”? Should we not simply have the packed take precedence?

Member

Yeah I think there's a pretty obvious semantics for repr(ordered_fields, packed(N)) with repr(aligned) fields -- treat those fields like all other fields, i.e., cap their alignment to N. Not sure if that's worth punting to a future possibility.
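
For reference, this is how `packed(N)` already caps the alignment of ordinary fields today; the suggestion above would treat `repr(align)` fields the same way. A sketch with an illustrative struct `Packed2`, assuming a target where `u64` normally has an alignment of at least 2:

```rust
use std::mem::{align_of, offset_of, size_of};

// `b` would normally need more than 2-byte alignment; `packed(2)` caps it at 2.
#[repr(C, packed(2))]
#[allow(dead_code)]
struct Packed2 {
    a: u8,
    b: u64,
}

/// Returns (offset of `b`, size of `Packed2`, alignment of `Packed2`).
fn packed2_layout() -> (usize, usize, usize) {
    (offset_of!(Packed2, b), size_of::<Packed2>(), align_of::<Packed2>())
}

fn main() {
    // Assumes `u64` has an alignment of at least 2 on the target.
    println!("{:?}", packed2_layout());
}
```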

Author

> Is this something you intend to do as part of this RFC, or is it a future possibility? If the former, you should specify the exact behavior in the RFC. If the latter, it should be mentioned in the future possibilities.

This is a reference to another RFC, I'll make this more clear.

> Also, what is the motivation for “(continuing to) disallow repr(ordered_fields, packed(N)) from containing aligned fields”? Should we not simply have the packed take precedence?

This is the status quo, so no motivation needed. You would need some motivation to switch to packed taking precedence.

Author

@RustyYato RustyYato Sep 3, 2025

> Yeah I think there's a pretty obvious semantics for repr(ordered_fields, packed(N)) with repr(aligned) fields -- treat those fields like all other fields, i.e., cap their alignment to N. Not sure if that's worth punting to a future possibility.

I would rather not scope creep this RFC to one that's too large and controversial to accept. Given that there is already a whole other RFC for exactly this case, I don't think this is a minor addition.

So, I'm trying to keep this RFC as small as possible. Since it is currently a hard error, I'm keeping it like that. But as a compromise, I will add it as an unresolved question, so we don't lose track of this. If it is not controversial and won't hamper accepting this RFC, I'm happy to add it in.

Member

@RalfJung RalfJung Sep 3, 2025

Let's see what the lang team says; they often like fixing some answer for those questions to make it more concrete what the RFC could turn into (and this one seems rather uncontroversial to me).

Note that we anyway have to define what the layout is for a repr(ordered_fields, packed) with a field of generic type, even if that generic type ends up instantiated with a repr(align) type. There's no way we can skip that question (except by forbidding generic types in repr(ordered_fields, packed), or by adding entirely new type system concepts to prevent this instantiation from happening -- neither of which is or should be suggested by the RFC). The current hard error is kind of a red herring, it doesn't actually reliably rule out repr(align) nested inside repr(packed).
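
A sketch of the generic case described above, using `repr(C, packed)` as a stand-in. The types `Over` and `Wrapper` are hypothetical, and the example assumes current rustc behaviour, where the hard error only fires when the over-aligned type is named directly in the packed struct's definition:

```rust
use std::mem::{align_of, offset_of, size_of};

// Over-aligned type: size 16, alignment 16.
#[repr(align(16))]
#[allow(dead_code)]
struct Over(u8);

// Naming `Over` directly as a field type here is rejected, but a generic
// parameter that is later instantiated with `Over` is accepted.
#[repr(C, packed)]
#[allow(dead_code)]
struct Wrapper<T> {
    tag: u8,
    value: T,
}

/// Returns (offset of `value`, size, alignment) for `Wrapper<Over>`.
fn wrapper_layout() -> (usize, usize, usize) {
    (
        offset_of!(Wrapper<Over>, value),
        size_of::<Wrapper<Over>>(),
        align_of::<Wrapper<Over>>(),
    )
}

fn main() {
    // The packed attribute wins: `value` sits right after `tag`.
    println!("{:?}", wrapper_layout());
}
```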

>
> - edited version of the [reference](https://doc.rust-lang.org/stable/reference/type-layout.html#the-c-representation) on `repr(C)`

The exact algorithm is deferred to whatever the default target C compiler does with default settings (or if applicable, the most commonly used settings).
Member

@RalfJung RalfJung Sep 3, 2025

What should repr(C) do when the corresponding C code is rejected by the default target C compiler?

This affects, for instance, fieldless structs, which are rejected by MSVC (but accepted by GCC). By extension it then also affects structs where all fields are PhantomData, as those should arguably be removed when translating a Rust type to a C type.

Contributor

I think we should specify that repr(Rust) or repr(ordered_fields) 1-ZSTs (incl. tuples) are always removed, but otherwise if there is no direct C equivalent (e.g. repr(Rust) or repr(ordered_fields) fields, or fieldless struct if not supported by the C compiler), the layout should be consistent within the same compiler version but otherwise unspecified.

Author

I've added this as an unresolved question, so we don't lose track of this.

I agree that we should remove any repr(Rust)/repr(ordered_fields) 1-ZSTs, but if we have a type that otherwise doesn't have an equivalent supported C type, maybe we should either error or try working with the C community to close this gap (i.e. fieldless types). Either by ensuring that they never make it compile in the future and giving us free rein to give semantics, or by making the code compile with some layout/ABI and then matching it ourselves (this would be preferred).

If the type has no equivalent (i.e. some repr(Rust)/repr(ordered_fields)), then either the layout is unspecified or we could treat it as some opaque blob with the same size and align, and lay it out how a C compiler would lay out a struct containing an array of the same size and align (i.e. `__attribute__((aligned(ALIGN))) struct Blob { char x[SIZE]; };`). This seems like it would be the most consistent.

We've been bitten before by making choices about what the C code ought to do and had to do some painful roll-backs (like this RFC, and env::set_var). So I think being more conservative around C code is probably a good thing.
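
As a rough Rust-side picture of the "opaque blob" idea, here is a sketch that builds a byte blob with the same size and alignment as a Rust-only type. `Opaque` and `Blob` are hypothetical names, and the alignment of 8 is hard-coded because `repr(align)` only accepts a literal:

```rust
use std::mem::{align_of, size_of};

// Hypothetical Rust-only type with no direct C equivalent.
#[repr(align(8))]
#[allow(dead_code)]
struct Opaque {
    id: u32,
    name: String,
}

// Byte blob with a fixed alignment of 8: the analogue of an aligned C struct
// that holds only a `char` array.
#[repr(C, align(8))]
#[allow(dead_code)]
struct Blob<const SIZE: usize> {
    bytes: [u8; SIZE],
}

type OpaqueBlob = Blob<{ size_of::<Opaque>() }>;

/// True when the blob has the same size and alignment as `Opaque`.
fn blob_matches_opaque() -> bool {
    size_of::<OpaqueBlob>() == size_of::<Opaque>()
        && align_of::<OpaqueBlob>() == align_of::<Opaque>()
}

fn main() {
    println!("blob matches: {}", blob_matches_opaque());
}
```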

Member

> or we could treat it as some opaque blob with the same size and align, and lay it out how a C compiler would lay out a struct containing an array of the same size and align (i.e. `__attribute__((aligned(ALIGN))) struct Blob { char x[SIZE]; };`)

I don't think we should promise any particular layout -- by leaving it unspecified, we at least de jure can change what we do to match what C does if it ever becomes legal in C.


Even though on AIX `__alignof__(double)` is 8, it is still laid out on a 4-byte boundary.

This is in stark contrast with `repr(C)` in Rust, which always lays out fields at their "natural alignment". Any fix for this would require splitting up `repr(C)` since anyone in case 2 cannot tolerate under-aligned fields (since it would disallow taking references to those fields).
Member

@RalfJung RalfJung Sep 3, 2025

> Rust, which always lays out fields at their "natural alignment"

No we don't? At least, "natural alignment" often means "align == size", and that's not what we do.

We always lay out fields at their required alignment. Maybe you meant that by "natural", but given that the term is ambiguous, let's better avoid it.
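
A quick way to see that required alignment and size are separate quantities in Rust (the helper `size_and_align` is illustrative):

```rust
use std::mem::{align_of, size_of};

/// Returns (size, required alignment) of `T` on the current target.
fn size_and_align<T>() -> (usize, usize) {
    (size_of::<T>(), align_of::<T>())
}

fn main() {
    // An array's alignment is that of its element, so align != size here.
    println!("[u8; 3]: {:?}", size_and_align::<[u8; 3]>());
    // For `u64` the size is always 8, but the required alignment is chosen
    // per target and may be smaller than the size.
    println!("u64:     {:?}", size_and_align::<u64>());
}
```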


For more details, see this discussion on [irlo](https://internals.rust-lang.org/t/repr-c-aix-struct-alignment/21594/3).

In AIX, the following struct `Floats` has the following field offsets: `[0, 64, 96]` (in bits)
Member

@RalfJung RalfJung Sep 3, 2025

That on its own wouldn't be so bad, it just indicates that double has a required alignment of 4 bytes.

The problem is that the struct overall has a size of 24 bytes -- it gets padded to a multiple of 8, since the first data member a is considered to have an alignment of 8. (I don't know the alignment of that struct and it's non-trivial to find out given all the games with "preferred" alignment that IBM plays.)

If this was a normal target where double has alignment 4, the size of the struct would be 20 bytes.
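
The arithmetic behind those two sizes can be sketched as follows. The field list (a `double`, a `float`, a `double`) is an assumption chosen to match the quoted offsets, not taken from the RFC text, and `layout` is an illustrative helper:

```rust
/// Rounds `offset` up to the next multiple of `align`.
fn round_up(offset: usize, align: usize) -> usize {
    offset.div_ceil(align) * align
}

/// Places `(size, align)` fields in order and pads the total to `struct_align`.
/// Returns the field offsets and the final size, all in bytes.
fn layout(fields: &[(usize, usize)], struct_align: usize) -> (Vec<usize>, usize) {
    let mut offset = 0;
    let mut offsets = Vec::with_capacity(fields.len());
    for &(size, align) in fields {
        offset = round_up(offset, align);
        offsets.push(offset);
        offset += size;
    }
    (offsets, round_up(offset, struct_align))
}

fn main() {
    // Assumed fields: double, float, double, each placed on a 4-byte boundary.
    let fields = [(8, 4), (4, 4), (8, 4)];
    // Ordinary target: struct alignment is the largest field alignment (4).
    let (offsets, ordinary_size) = layout(&fields, 4);
    // AIX as described above: same offsets, but the size is padded to a multiple of 8.
    let (_, aix_size) = layout(&fields, 8);
    println!("offsets = {offsets:?}, ordinary = {ordinary_size}, AIX = {aix_size}");
}
```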

Author

I've reworded the AIX section, so hopefully everything is correct now 🤞

@@ -89,7 +84,7 @@ Field `b` would be laid out at offset 4, which is under-aligned (since `f64` has

For more details, see this discussion on [irlo](https://internals.rust-lang.org/t/repr-c-aix-struct-alignment/21594/3).

-In AIX, the following struct `Floats` has the following field offsets: `[0, 8, 12]` (in bytes)
+In AIX, the following struct `Floats` has the following field offsets: `[0, 8, 12]` (in bytes) and a size of 24 bytes (since the first field has a preferred alignment of 8 bytes).
Member

I don't follow the part after the "since" here. Why would the preferred alignment affect the size?

Author

@RustyYato RustyYato Sep 3, 2025

Right, this should say natural alignment, and that is correct since the IBM docs say

> Notes:
>
> 1. In aggregates, the first member of this data type is aligned according to its natural alignment value; subsequent members of the aggregate are aligned on 4-byte boundaries.

Although after reading this again, I think this also means that

struct Ints {
    char a;
    char b; // is this also aligned to a 4-byte boundary?
};

~~And this seems cursed, but maybe I'm missing something~~ I was missing something 🤦

all these different kinds of alignment are very confusing!

Contributor

@Jules-Bertholet Jules-Bertholet Sep 4, 2025

> 1. In aggregates, the first member of this data type is aligned according to its natural alignment value; subsequent members of the aggregate are aligned on 4-byte boundaries.

That footnote only applies to entries in the table marked with 1, i.e. 64-bit and 128-bit binary floats with the default alignment setting.

Member

Unfortunately IBM doesn't define what they mean by "natural alignment", or do they?

Contributor

> Unfortunately IBM doesn't define what they mean by "natural alignment", or do they?

They mean align = size. I’m not sure if they say so explicitly, but based on the table, it’s clearly what they mean.

Author

I've been looking into the various kinds of alignment, and it looks like natural alignment commonly means aligned to the size. So I assume that this is a common term that they aren't bothering to define.

Sources

(citing the prestigious Wikipedia 😆)

> The CPU in modern computer hardware performs reads and writes to memory most efficiently when the data is naturally aligned, which generally means that the data's memory address is a multiple of the data size

source: https://en.wikipedia.org/wiki/Data_structure_alignment

> We call a datum naturally aligned if its address is aligned to its size. It's called misaligned otherwise.

source: https://learn.microsoft.com/en-us/cpp/cpp/alignment-cpp-declarations?view=msvc-170#alignment-and-memory-addresses

It is also implied in other sources

> ARM and Thumb processors are designed to efficiently access naturally aligned data, that is, doublewords that lie on addresses that are multiples of eight, words that lie on addresses that are multiples of four, halfwords that lie on addresses that are multiples of two, and single bytes that lie at any byte address. Such data is located on its natural size boundary.

source: https://developer.arm.com/documentation/dui0472/g/compiler-coding-practices/advantages-of-natural-data-alignment

Author

I've updated the RFC, now it says natural alignment (and provides a definition inline just in case someone doesn't know what that means - like I didn't before!)

I also clarified that the note only applied to double and long double.

@traviscross traviscross added P-lang-drag-3 Lang team prioritization drag level 3. and removed P-lang-drag-2 Lang team prioritization drag level 2. labels Sep 10, 2025